Three Connectionist Implementations of Dynamic Programming for Optimal Control: A Preliminary Comparative Analysis

Authors

Hugues Bersini

Vittorio Gorrini

Abstract

This paper presents three optimal control methodologies and applies them to the same elementary Rendez-Vous problem. All three rely on neural networks for their universal approximation capabilities and on dynamic programming to replace the time-integral optimization with a succession of time-local optimizations. First, a simplified version of the Back-Propagation-Through-Time algorithm is presented as the most faithful implementation of dynamic programming when the optimal controller is approximated by a neural network (learning by gradient descent) and the process model is available. Relaxing the need for an explicit prior model of the process, Reinforcement Learning (RL) approaches, for both continuous and discrete controllers, are then described and tested on the Rendez-Vous problem. The results and the numerous methodological difficulties we met are discussed. The most successful Reinforcement Learning approach is the connectionist implementation of Q-learning with all Q-values approximated by Radial-Basis-Function networks. However, when searching for a continuous optimal controller, the price RL has to pay for the absence of a model turns out to be far from negligible in terms of methodological difficulties, lack of robustness, convergence time and quality of the discovered solution.
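As a rough illustration of the kind of scheme the abstract names (Q-learning with one Radial-Basis-Function approximator per discrete action), the sketch below may help. It is not the authors' implementation: the one-dimensional plant, the reward, the RBF centres and width, and every constant and function name here are assumptions chosen only to keep the example small.

```python
import math
import random

# All values below are illustrative assumptions, not taken from the paper.
CENTERS = [-1.0 + 0.25 * i for i in range(9)]  # RBF centres on [-1, 1]
WIDTH = 0.25                                   # shared Gaussian width
ACTIONS = [-0.1, 0.0, 0.1]                     # discrete control moves
GAMMA = 0.9                                    # discount factor
ALPHA = 0.1                                    # learning rate


def rbf_features(x):
    """Gaussian radial-basis activations of the scalar state x."""
    return [math.exp(-((x - c) ** 2) / (2.0 * WIDTH ** 2)) for c in CENTERS]


def q_value(weights, x, a):
    """Q(x, a) as a linear read-out of the RBF layer for action index a."""
    return sum(w * p for w, p in zip(weights[a], rbf_features(x)))


def step(x, a):
    """Hypothetical toy plant: move by ACTIONS[a], clip to [-1, 1].

    The reward is minus the squared distance to the meeting point x = 0.
    """
    x_next = max(-1.0, min(1.0, x + ACTIONS[a]))
    return x_next, -(x_next ** 2)


def q_update(weights, x, a, reward, x_next):
    """One Q-learning step on the weights of action a; returns the TD error."""
    best_next = max(q_value(weights, x_next, b) for b in range(len(ACTIONS)))
    delta = reward + GAMMA * best_next - q_value(weights, x, a)
    for i, p in enumerate(rbf_features(x)):
        weights[a][i] += ALPHA * delta * p
    return delta


def train(episodes=200, horizon=30, epsilon=0.2, seed=0):
    """Epsilon-greedy Q-learning; no model of the plant is used by the learner."""
    rng = random.Random(seed)
    weights = [[0.0] * len(CENTERS) for _ in ACTIONS]
    for _ in range(episodes):
        x = rng.uniform(-1.0, 1.0)
        for _ in range(horizon):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)),
                        key=lambda b: q_value(weights, x, b))
            x_next, reward = step(x, a)
            q_update(weights, x, a, reward, x_next)
            x = x_next
    return weights
```

Calling `train()` returns one weight vector per action; a greedy controller would then pick, in state `x`, the action index that maximizes `q_value(weights, x, a)`. The learner only ever sees sampled transitions from `step`, which is the sense in which such an approach dispenses with a prior process model.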

Similar resources

A numerical approach for optimal control model of the convex semi-infinite programming

In this paper, convex semi-infinite programming is converted to an optimal control model of neural networks, and the optimal control model is solved by an iterative dynamic programming method. Finally, numerical examples are provided to illustrate the proposed method.

Stochastic Dynamic Programming with Markov Chains for Optimal Sustainable Control of the Forest Sector with Continuous Cover Forestry

We present a stochastic dynamic programming approach with Markov chains for optimal control of the forest sector. The forest is managed via continuous cover forestry and the complete system is sustainable. Forest industry production, logistic solutions and harvest levels are optimized based on the sequentially revealed states of the markets. Adaptive full system optimization is necessary for co...

Extracting Dynamics Matrix of Alignment Process for a Gimbaled Inertial Navigation System Using Heuristic Dynamic Programming Method

In this paper, with the aim of estimating the internal dynamics matrix of a gimbaled Inertial Navigation System (as a discrete linear system), the discrete-time Hamilton-Jacobi-Bellman (HJB) equation for optimal control has been extracted. The Heuristic Dynamic Programming (HDP) algorithm for solving the equation has been presented and then a neural network approximation for cost function and control input ...

A Multi-Stage Single-Machine Replacement Strategy Using Stochastic Dynamic Programming

In this paper, the single-machine replacement problem is modeled within the frameworks of stochastic dynamic programming and control threshold policy, and some properties of the optimal values of the control thresholds are derived. Using these properties and by minimizing a cost function, the optimal values of two control thresholds for the time between productions of two successive nonco...

A DSS-Based Dynamic Programming for Finding Optimal Markets Using Neural Networks and Pricing

One of the substantial challenges in marketing efforts is determining optimal markets, specifically in market segmentation. The problem is more controversial in electronic commerce and electronic marketing. Consumer behaviour is influenced by different factors and thus varies in different time periods. These dynamic impacts lead to the uncertain behaviour of consumers and therefore harden the t...

Publication date: 1996
